Toward perfect reads
Abstract
We propose a new method to correct short reads using de Bruijn graphs, implemented as a tool called Bcool. Bcool first constructs a corrected compacted de Bruijn graph from the reads. This graph then serves as a reference, and each read is corrected according to its mapping on the graph. We show that this approach yields better correction than k-mer-spectrum techniques while remaining scalable, making it applicable to human-size genomic datasets and beyond. The implementation is open source and available at github.com/Malfoy/BCOOL.
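To make the idea of correcting reads against a graph more concrete, here is a minimal toy sketch. It is not Bcool's implementation. It assumes a single strand (no reverse complements), substitution errors only, and an uncompacted graph stored as a plain set of k-mers. The function names `solid_kmers`, `successors`, and `correct_read`, and the parameter `min_count`, are hypothetical and chosen for illustration only.

```python
from collections import Counter


def solid_kmers(reads, k, min_count):
    """Return the k-mers seen at least min_count times (toy graph nodes)."""
    counts = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    return {kmer for kmer, count in counts.items() if count >= min_count}


def successors(kmer, graph):
    """List the graph k-mers that overlap kmer by k-1 bases on the right."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in graph]


def correct_read(read, graph, k):
    """Thread a read through the graph, fixing substitutions where the
    path is unambiguous (assumption: toy stand-in for read mapping)."""
    # Anchor on the first k-mer of the read that exists in the graph.
    start = next(
        (i for i in range(len(read) - k + 1) if read[i:i + k] in graph), None
    )
    if start is None:
        return read  # nothing to anchor on: leave the read untouched
    corrected = list(read)
    current = read[start:start + k]
    for pos in range(start + k, len(read)):
        candidate = current[1:] + corrected[pos]
        if candidate not in graph:
            options = successors(current, graph)
            if len(options) != 1:
                break  # dead end or branch: stop correcting here
            candidate = options[0]
            corrected[pos] = candidate[-1]  # replace the erroneous base
        current = candidate
    return "".join(corrected)
```

In this sketch the abundance filter plays the role of graph cleaning, and the walk stops at any branch rather than choosing a path. A real tool has to handle branching paths, both strands, and errors that occur before the first anchor.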
Journal: CoRR
Volume: abs/1711.03336
Publication date: 2017